Co-STAR: A Co-training Style Algorithm for Hyponymy Relation Acquisition from Structured and Unstructured Text

Authors

Jong-Hoon Oh

Ichiro Yamada

Kentaro Torisawa

Stijn De Saeger

Abstract

This paper proposes a co-training style algorithm called Co-STAR that acquires hyponymy relations simultaneously from structured and unstructured text. In Co-STAR, two independent processes for hyponymy relation acquisition, one handling structured text and the other handling unstructured text, collaborate by repeatedly exchanging the knowledge they have acquired about hyponymy relations. Unlike conventional co-training, the two processes in Co-STAR are applied to different source texts and training data. We show the effectiveness of this algorithm through experiments on large-scale hyponymy-relation acquisition from Japanese Wikipedia and Web texts. We also show that Co-STAR is robust against noisy training data.
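The exchange loop described in the abstract can be illustrated with a minimal sketch. This is a generic co-training-style loop under stated assumptions, not the authors' implementation: `ToyLearner`, `confident_labels`, `co_train`, the confidence threshold, and the toy data are all hypothetical. Each learner has its own training data and its own candidate pool (standing in for structured and unstructured text), and confidently labeled candidates from one learner are handed to the other as extra training data.

```python
class ToyLearner:
    """Hypothetical stand-in for a real classifier (assumption, not from the paper).

    It scores a (hypernym, hyponym) pair by how often the hypernym appears
    with a positive label in this learner's own training data.
    """

    def __init__(self, labeled):
        self.labeled = dict(labeled)  # {(hypernym, hyponym): bool}

    def score(self, pair):
        hypernym = pair[0]
        labels = [y for (h, _), y in self.labeled.items() if h == hypernym]
        # Laplace-smoothed estimate that the pair is a hyponymy relation.
        return (sum(labels) + 1) / (len(labels) + 2)


def confident_labels(learner, pool, threshold):
    """Label only the candidates the learner is confident about."""
    out = {}
    for pair in pool:
        s = learner.score(pair)
        if s >= threshold:
            out[pair] = True
        elif s <= 1 - threshold:
            out[pair] = False
    return out


def co_train(labeled_s, labeled_u, pool_s, pool_u, rounds=5, threshold=0.7):
    """Two learners with separate training data and separate source pools
    repeatedly exchange their confidently labeled candidates."""
    learner_s, learner_u = ToyLearner(labeled_s), ToyLearner(labeled_u)
    pool_s, pool_u = set(pool_s), set(pool_u)
    for _ in range(rounds):
        new_s = confident_labels(learner_s, pool_s, threshold)
        new_u = confident_labels(learner_u, pool_u, threshold)
        if not new_s and not new_u:
            break  # nothing left to exchange
        # Exchange: each learner's confident output trains the other learner.
        # setdefault keeps any label the receiving learner already had.
        for pair, label in new_s.items():
            learner_u.labeled.setdefault(pair, label)
        for pair, label in new_u.items():
            learner_s.labeled.setdefault(pair, label)
        pool_s.difference_update(new_s)
        pool_u.difference_update(new_u)
    return learner_s, learner_u


# Toy data (hypothetical): "s" plays the structured source, "u" the unstructured one.
labeled_s = {("fruit", "apple"): True, ("fruit", "banana"): True, ("fruit", "cherry"): True}
labeled_u = {("city", "Tokyo"): True, ("city", "Osaka"): True, ("city", "Kyoto"): True}
pool_s = {("fruit", "mango"), ("fruit", "peach")}
pool_u = {("city", "Nagoya"), ("fruit", "kiwi")}

learner_s, learner_u = co_train(labeled_s, labeled_u, pool_s, pool_u)
print(sorted(learner_s.labeled))
print(sorted(learner_u.labeled))
```

In this toy run the second learner starts with no evidence about "fruit" and can only label the ("fruit", "kiwi") candidate after receiving the first learner's confident "fruit" pairs, which is the kind of benefit an exchange between two sources is meant to provide.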


Similar resources

Bilingual Co-Training for Monolingual Hyponymy-Relation Acquisition

This paper proposes a novel framework called bilingual co-training for a large-scale, accurate acquisition method for monolingual semantic knowledge. In this framework, we combine the independent processes of monolingual semantic-knowledge acquisition for two languages using bilingual resources to boost performance. We apply this framework to large-scale hyponymy-relation acquisition from Wikipedi...


Extracting Hyponymic Relations from Chinese Free Corpus

Research on hyponymy acquisition is a basic and crucial problem in knowledge acquisition from text. In this paper we present a method of hyponymic relation acquisition and verification based on Chinese lexico-syntactic patterns. Firstly, we make use of removable lexicons and sentence patterns that have been semi-automatically obtained to analyze Chinese-isa patterns. Then we use an algorithm th...


Discovering Multi Terms and Co-hyponymy from XHTML Documents with XTREEM

The Semantic Web needs ontologies as an integral component. Current methods for learning and enhancing ontologies need to be further improved to overcome the knowledge acquisition bottleneck. The identification of concepts and relations with only minimal user interaction is still a challenging objective. Current approaches performed to extract semantics often use association rules or clusterin...


A hybrid neural–genetic algorithm for predicting pure and impure CO2 minimum miscibility pressure

Accurate prediction of the minimum miscibility pressure (MMP) in a gas injection process is crucial to optimizing the management of gas injection projects. Because the ...


Acquiring Hyponymy Relations from Web Documents

This paper describes an automatic method for acquiring hyponymy relations from HTML documents on the WWW. Hyponymy relations can play a crucial role in various natural language processing systems. Most existing acquisition methods for hyponymy relations rely on particular linguistic patterns, such as “NP such as NP”. Our method, however, does not use such linguistic patterns, and we expect that...




Publication date: 2010
